Abstract

We consider an infinite horizon dynamic mechanism design problem with interdependent
valuations. In this setting the type of each agent is assumed to be evolving according
to a first order Markov process and is independent of the types of other agents.
However, the valuation of an agent can depend on the types of other agents, which
makes the problem fall into an interdependent valuation setting. Designing truthful
mechanisms in this setting is non-trivial in view of an impossibility result which
says that for interdependent valuations, any efficient and ex-post incentive compatible
mechanism must be a constant mechanism, even in a static setting. Mezzetti (2004)
circumvents this problem by splitting the decisions of allocation and payment into two
stages. However, Mezzetti's result is limited to a static setting and moreover in the
second stage of that mechanism, agents are weakly indifferent about reporting their
valuations truthfully. This paper provides a first attempt at designing a dynamic
mechanism which is efficient, strict ex-post incentive compatible and ex-post individually
rational in a setting with interdependent values and Markovian type evolution.

1 Introduction 

Organizations often face the problem of executing a task for which they do not have enough
resources or expertise. It may also be difficult, both logistically and economically, to acquire
those resources. For example, in the area of healthcare, it has been observed that there are
very few occupational health professionals and doctors and nurses in all specialities at the
hospitals in the UK (Nicholson, 2004). With the advances in computing and communication
technologies, a natural solution to this problem is to outsource the tasks to experts outside
the organization. Hiring experts beyond an organization was already in practice. However,
with the advent of the Internet, this practice has extended even beyond international
boundaries, e.g., some U.S. hospitals are outsourcing the tasks of reading and analyzing scan
reports to companies in Bangalore, India (Associated-Press, 2004). Gupta et al. (2008) give
a detailed description of how the healthcare industry uses the outsourcing tool.

*A preliminary version of this work was presented in the conference on Uncertainty in Artificial
Intelligence, 2011.

The organizations where the tasks are outsourced (let us call them vendors) have quite 
varied efficiency levels. For tasks like healthcare, it is extremely important to hire the right 
set of experts. If the efficiency levels of the vendors and the difficulties of the medical tasks 
are observable by a central management (controller), and if the efficiency levels vary over time 
according to a Markov process, the problem of selecting the right set of experts reduces to a 
Markov Decision Problem (MDP), which has been well studied in the literature (Bertsekas, 
1995; Puterman, 2005). Let us refer to the efficiency levels and task difficulties together as
the types of the tasks and resources.

However, the types are usually observed privately by the vendors and hospitals (agents), 
who are rational and intelligent. The efficiencies of the vendors are private information of the 
vendors (depending on what sort of doctors they hire, or machines they use), and they might 
misreport this information in order to win the contract and to increase their net returns. At 
the same time the difficulty of the medical task is private to the hospital, and is unknown 
to the experts. A strategic hospital, therefore, can misreport the task difficulty to the hired 
experts as well. Hence, the asymmetry of information at different agents’ end transforms 
the problem from a completely or partially observable MDP into a dynamic game among the 
agents. 

Motivated by examples of this kind, we analyze such problems in this paper using a formal
mechanism design framework. We consider only cases where the solution of the problem
involves monetary compensation in quasi-linear form. The reporting strategy of the agents
and the decision problem of the controller are dynamic since we assume that the types of
the tasks and resources vary with time. In addition, the above problem has two
characteristics, namely, interdependent values: in a selected team of agents, the valuation of
an agent depends not only on her own skills but also on the skills of other selected agents,
and exchange economy: a trade environment where both buyers (task owners) and sellers
(resources) are present. In this paper, the modeling and analysis are centered
around the setting of task outsourcing to strategic experts. We aim for a socially
efficient mechanism that, at the same time, ensures truthfulness and voluntary
participation of the agents.

1.1 Prior work 

The above properties have been investigated separately in the literature on dynamic mechanism
design. Bergemann and Valimaki (2010) have proposed an efficient mechanism called the
dynamic pivot mechanism, which is a generalization of the Vickrey-Clarke-Groves (VCG)
mechanism (Vickrey, 1961; Clarke, 1971; Groves, 1973) to a dynamic setting, and is
truthful and efficient. Athey and Segal (2007) consider a similar setting with the aim of finding
an efficient mechanism that is budget balanced. Cavallo et al. (2006) develop a mechanism
similar to the dynamic pivot mechanism in a setting with agents whose type evolution follows
a Markov process. In a later work, Cavallo et al. (2009) consider periodically inaccessible
agents and dynamic private information jointly. Even though these mechanisms work for an
exchange economy, they have the underlying assumption of private values, i.e., the reward
experienced by an agent is a function of the allocation and her own private types. Mezzetti
(2004, 2007), on the other hand, explored the other facet, namely, interdependent values, but
in a static setting, and proposed a truthful mechanism. The mechanisms proposed in these two
papers use two stages, since it is impossible to design a single-stage mechanism
satisfying both truthfulness and efficiency even in a static setting (Jehiel and Moldovanu,
2001). However, the mechanism provides only a weak truthfulness guarantee in the second stage
of the game. A similar result in the setting of interdependent valuations with static types
by Nath and Zoeter (2013) ensures that the truthfulness guarantee is strict. However, since
both Nath and Zoeter (2013) and Mezzetti (2004) consider mechanisms that use two stages
of information realization - in the first stage the types are realized and the allocation is
decided, and in the second stage the valuations are realized by the agents and payments are
decided - both of them require attention to how the information is revealed to the agents.
In this paper, we follow an approach similar to Nath and Zoeter (2013) that guarantees
strict truthfulness. However, the equilibrium concept used here is ex-post Nash because
we assume agents play in an incomplete information setting, and we contrast this with the
mechanism of Mezzetti (2004). We also discuss how a complete information setting along
with the equilibrium concept of subgame perfection plays an important role in these results.
We explain this point in detail while presenting the main result of the paper.

1.2 Contributions 

In this paper, we propose a dynamic mechanism named MDP-based Allocation and 
TRansfer in Interdependent-valued eXchange economies (abbreviated MATRIX), which is 
designed to address the class of interdependent values. It extends the results of Mezzetti 
(2004) to a dynamic setting, and with a certain allocation and valuation structure, serves as 
an efficient, truthful mechanism where agents receive non-negative payoffs by participating 
in it. The key feature that distinguishes our model and results from those of the existing
dynamic mechanism literature is that we address the interdependent values and dynamically
varying types (in an exchange economy) jointly and provide a strict ex-post incentive
compatible mechanism. In Table 1, we have summarized the different paradigms of the 
mechanism design problem, and their corresponding solutions in the literature. 


Valuations     | STATIC                                                    | DYNAMIC
Independent    | VCG Mechanism (Vickrey, 1961; Clarke, 1971; Groves, 1973) | Dynamic Pivot Mechanism (Bergemann and Valimaki, 2010; Cavallo et al., 2006)
Interdependent | Generalized VCG (Mezzetti, 2004)                          | Mechanism MATRIX (this paper)

Table 1: The different paradigms of mechanism design problems with their solutions.

Our main contributions in this paper can be summarized as follows.

• We propose a dynamic mechanism MATRIX, that is efficient, truthful (Theorem 1) and
voluntary participatory (Theorem 2) for the agents in an interdependent-valued exchange
economy.

► This extends the classic mechanism proposed by Mezzetti (2004) to a dynamic setting.

► It solves the issue of weak indifference by the agents in the second stage of the classic
mechanism.

However, we will see that Theorem 1 is true with a restricted domain of subset allocation 
and peer-influenced valuations. These two properties were not needed to achieve a 
similar claim in the static setting (Nath and Zoeter, 2013). We do not know if these 
are the minimal requirements for efficiency and truthfulness, but it is important to 
note that these properties in the dynamic setting do not immediately follow from its 
static counterpart. 

• We discuss why the dynamic pivot mechanism (Bergemann and Valimaki, 2010) does 
not satisfy all the properties that MATRIX satisfies (Section 3.2). 

• We discuss how these results can be extended to a more general setting (Section 4).

We also note that MATRIX comes at a computational cost which is the same as that of
its independent value counterpart (Section 3.4).

The rest of the paper is organized as follows. We introduce the formal model in Section 2,
and present the main results in Section 3. In Section 4, we discuss a generalization of
the main results. We conclude the paper in Section 5 with some potential future work.

2 Background and Model 

Let the set of agents be given by $N = \{1, \ldots, n\}$, who interact with each other for a countably
infinite time horizon indexed by time steps $t = 0, 1, 2, \ldots$. The time-dependent type of each
agent is denoted by $\theta_{i,t} \in \Theta_i$ for $i \in N$. We will use the shorthands
$\theta_t = (\theta_{1,t}, \ldots, \theta_{n,t}) = (\theta_{i,t}, \theta_{-i,t})$, where $\theta_{-i,t}$ denotes the type vector
of all agents excluding agent $i$. We will refer to $\theta_t$ as the type profile at time $t$,
$\theta_t \in \Theta = \times_{i \in N} \Theta_i$.

The allocation set is denoted by $A$. In each round $t$, the mechanism designer chooses
an allocation $a_t$ from this set and decides a payment $p_{i,t}$ to agent $i$. The allocation leads
to a valuation to agent $i$, $v_i : A \times \Theta \to \mathbb{R}$. This is in contrast to the classical independent
valuations (also called private values) case where valuations are assumed to depend only on
$i$'s own type; $v_i : A \times \Theta_i \to \mathbb{R}$. However, we assume for all $i$, $|v_i(a, \theta)| < M < \infty$, for some
$M \in \mathbb{R}$ and for all $a$ and $\theta$.

Stationary Markov Type Transitions, SMTT   The combined type $\theta_t$ follows a first
order Markov process which is governed by the transition probability function $F(\theta_{t+1} \mid a_t, \theta_t)$,
which is independent across agents, where $a_t$ is the allocation at period $t$.

Definition 1 (Stationary Markov Type Transitions, SMTT) We say that the type transitions
follow stationary Markov type transitions if the joint distribution $F$ of the types of
the agents $\theta_t = (\theta_{1,t}, \ldots, \theta_{n,t})$ and the marginals $F_i$'s exhibit the following for all $t$.

\[ F(\theta_{t+1} \mid a_t, \theta_t, \theta_{t-1}, \ldots, \theta_0) = F(\theta_{t+1} \mid a_t, \theta_t), \quad \text{and} \]

\[ F(\theta_{t+1} \mid a_t, \theta_t) = \prod_{i \in N} F_i(\theta_{i,t+1} \mid a_t, \theta_{i,t}). \tag{1} \]

We will assume the types to follow SMTT throughout this paper. 
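
As an illustration of Definition 1, the following minimal sketch builds a joint transition out of per-agent marginals and samples from it. The binary type space, the helper names (`make_marginal`, `joint_probability`, `sample_next_profile`) and the numeric probabilities are assumptions made for this example only; they do not come from the paper.

```python
import itertools
import random

def make_marginal(i, stay_if_selected=0.6, stay_otherwise=0.9):
    """Hypothetical marginal F_i over binary types {0, 1} (illustrative numbers)."""
    def F_i(theta_i, allocation):
        stay = stay_if_selected if i in allocation else stay_otherwise
        return {theta_i: stay, 1 - theta_i: 1.0 - stay}
    return F_i

def joint_probability(theta_next, theta, allocation, marginals):
    """F(theta_{t+1} | a_t, theta_t) as the product of the marginals F_i."""
    p = 1.0
    for i in theta:
        p *= marginals[i](theta[i], allocation).get(theta_next[i], 0.0)
    return p

def sample_next_profile(theta, allocation, marginals, rng):
    """Draw theta_{t+1}: each agent's type moves independently of the others."""
    nxt = {}
    for i in theta:
        dist = marginals[i](theta[i], allocation)
        nxt[i] = rng.choices(list(dist), weights=list(dist.values()), k=1)[0]
    return nxt

agents = [1, 2]
marginals = {i: make_marginal(i) for i in agents}
theta_t = {1: 0, 2: 1}
a_t = frozenset({1})

# The product of marginals is a proper distribution over next profiles.
total = sum(
    joint_probability(dict(zip(agents, nxt)), theta_t, a_t, marginals)
    for nxt in itertools.product([0, 1], repeat=len(agents))
)
theta_next = sample_next_profile(theta_t, a_t, marginals, random.Random(0))
print(round(total, 12), theta_next)
```

The next profile depends only on the current profile and allocation (the Markov part), and the joint probability factorizes over agents (the independence part).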

For an easier exposition of the more general properties that lead to the same conclusions 
as in this paper, we will restrict our attention to a restricted space of allocations and valua¬ 
tions. In Section 4, we comment on the generalization of our results by introducing certain 
assumptions that subsume the following two assumptions on the allocation and valuations. 

Subset Allocation, SA   Let us motivate this restriction with the medical task assignment
example given in the previous section. The organizations outsource tasks to experts
for a payment, where the experts may have different and often time-varying capabilities of
executing the task. The task owners come with a specific task difficulty (type of the task
owner), which is usually privately known to them, while the workers' capabilities (types of
the workers) are their private information. A central planner's job in this setting is to
efficiently assign the tasks to a group of workers. Clearly, in this setting, the set of possible
allocations is the set of the subsets of agents, i.e., $A = 2^N$. Note that, for a finite set of
players, the allocation set is always finite. So, we can formally define this setting as follows.

Definition 2 (Subset Allocation, SA) When the set of allocations is the set of all subsets
of the agent set, i.e., $A = 2^N$, we call the domain a subset allocation domain. Similarly,
$A_{-i} = 2^{N \setminus \{i\}}$ denotes the set of allocations excluding agent $i$.
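
A small sketch of the subset allocation domain, assuming agents are labelled by integers. The function names are illustrative and not part of the paper.

```python
from itertools import combinations

def subset_allocations(agents):
    """Return every subset of `agents` as a frozenset, i.e. the power set 2^N."""
    agents = list(agents)
    return [frozenset(c)
            for r in range(len(agents) + 1)
            for c in combinations(agents, r)]

def allocations_without(agents, i):
    """Return A_{-i}: all subsets of the agent set once agent i is removed."""
    return subset_allocations(a for a in agents if a != i)

N = [1, 2, 3]
A = subset_allocations(N)
A_minus_2 = allocations_without(N, 2)
print(len(A), len(A_minus_2))
```

For $n$ agents the domain has $2^n$ allocations, so exhaustive enumeration like this is only practical for small $n$.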


Peer Influenced Valuations, PIV   Even though the valuation of agent $i$ is affected
not only by her private type but also by the types of others, it is often the case that the
valuation is affected by the types (e.g. the efficiencies of the workers in a joint project) of
only the selected agents. The valuation therefore is a function of the types of the allocated
agents and not the whole type vector. We also assume that the value of a non-selected agent
is zero. The set of valuations satisfying the above two conditions is called the set of peer
influenced valuations (PIV).

Definition 3 (Peer Influenced Valuations, PIV) This is a special set of interdependent
valuations in the SA domain, where the valuation of agent $i$ is a function of the types
of the selected agents only, given by,

\[ v_i(a, \theta) = \begin{cases} v_i(a, \theta_a) & \text{if } i \in a, \\ 0 & \text{otherwise,} \end{cases} \tag{2} \]

where $\theta_a \in \times_{i \in a} \Theta_i$, for an allocation $a \in A = 2^N$.
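
The two conditions of Definition 3 can be sketched as a wrapper around an arbitrary valuation. The `team_value` function below is a hypothetical example chosen for illustration, not a valuation used in the paper.

```python
def piv_value(i, allocation, types, base_value):
    """Peer-influenced valuation of agent i.

    Only the types of selected agents are passed on to `base_value`;
    a non-selected agent gets value zero.
    """
    if i not in allocation:
        return 0.0
    theta_a = {j: types[j] for j in allocation}  # restrict to selected agents
    return base_value(i, allocation, theta_a)

def team_value(i, allocation, theta_a):
    """Hypothetical valuation: own type times the average type of the team."""
    return theta_a[i] * sum(theta_a.values()) / len(theta_a)

types = {1: 2.0, 2: 4.0, 3: 10.0}
team = frozenset({1, 2})
print(piv_value(1, team, types, team_value))  # agent 3's type plays no role
print(piv_value(3, team, types, team_value))  # agent 3 is not selected
```

Because the type of an unselected agent never reaches `base_value`, changing it cannot change anyone's valuation, which is the property the counterfactual welfare without an agent relies on.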


The properties SA and PIV together allow for a well-defined counterfactual social welfare
in a world where a particular agent does not exist. See also Equation (8).

Efficient Allocation, EFF   The mechanism designer aims to maximize the sum of
the valuations of task owners and workers, summed over an infinite horizon, geometrically
discounted with factor $\delta \in (0,1)$. The discount factor accounts for the fact that a future
payoff is less valued by an agent than a current stage payoff. We assume $\delta$ to be common
knowledge. If the designer had perfect information about the $\theta_t$'s, his objective would
be to find a policy $\pi_t$, which is a sequence of allocation functions from time $t$, that yields the
following for all $t$ and for all type profiles $\theta_t$,

\[ \pi_t \in \arg\max_{\gamma} \mathbb{E}_{\gamma,\theta_t}\left[ \sum_{s=t}^{\infty} \delta^{s-t} \sum_{i \in N} v_i(a_s(\theta_s), \theta_s) \right], \tag{3} \]

where $\gamma = (a_t(\cdot), a_{t+1}(\cdot), \ldots)$ is any arbitrary sequence of allocation functions. Here we use
$\mathbb{E}_{\gamma,\theta_t}[\cdot] = \mathbb{E}[\,\cdot \mid \theta_t; \gamma]$ for brevity of notation. We point to the fact that the allocation policy
$\gamma$ is not a random variable in this expectation computation. The policy is a functional that
specifies what action to take in each time instant for a given type profile. Different policies
will lead to different sequences of allocation functions over the infinite horizon, and the
efficient allocation is the one that maximizes the expected discounted sum of the valuations
of all the agents.

In general, the allocation policy $\pi_t$ depends on the time instant $t$. However, for the special
kind of stochastic behavior of the type vectors, namely SMTT, and due to the infinite horizon
discounted utility, this policy becomes stationary, i.e., independent of $t$. We will denote such
a stationary policy by $\pi = (a(\cdot), a(\cdot), \ldots)$. Thus, the efficient allocation under SMTT reduces
to solving for the optimal action in the following stationary Markov Decision Problem (MDP).


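\[ W(\theta_t) = \max_{\pi} \mathbb{E}_{\pi,\theta_t}\left[ \sum_{s=t}^{\infty} \delta^{s-t} \sum_{j \in N} v_j(a(\theta_s), \theta_s) \right]
 = \max_{a \in A} \mathbb{E}_{a,\theta_t}\left[ \sum_{j \in N} v_j(a, \theta_t) + \delta\, \mathbb{E}_{\theta_{t+1} \mid a, \theta_t} W(\theta_{t+1}) \right]. \tag{4} \]

Here, with a slight abuse of notation, we have used $a$ to denote the actual action taken in
$t$ rather than the allocation function. The second equality comes from a standard recursive
argument for stationary infinite horizon MDPs. We refer an interested reader to a standard
text (e.g., Puterman, 2005) for this reduction and the general properties of MDPs. We have
used the shorthand $\mathbb{E}_{\theta_{t+1} \mid a, \theta_t}[\cdot]$ for the expectation over $\theta_{t+1}$ conditioned on the
allocation $a$ and the type profile $\theta_t$. We refer to $W$ as the
social welfare. The efficient allocation under SMTT is defined as follows.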


Definition 4 (Efficient Allocation, EFF) An allocation policy $a(\cdot)$ is efficient under
SMTT if for all type profiles $\theta_t$,

\[ a(\theta_t) \in \arg\max_{a \in A} \mathbb{E}_{a,\theta_t}\left[ \sum_{j \in N} v_j(a, \theta_t) + \delta\, \mathbb{E}_{\theta_{t+1} \mid a, \theta_t} W(\theta_{t+1}) \right]. \tag{5} \]

[Figure 1 layout, at time $t$. Stage 1: agents observe true types $\theta_{i,t}$ and report types $\hat{\theta}_{i,t}$; the allocation $a(\hat{\theta}_t)$ is chosen. Stage 2: agents observe true values $v_i(a(\hat{\theta}_t), \theta_t)$ and report values $\hat{v}_{i,t}$; the payment $p(\hat{\theta}_t, \hat{v}_t)$ is decided.]

Figure 1: Graphical illustration of a candidate dynamic mechanism in an interdependent
value setting.
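
The recursion for the social welfare $W$ and the efficient allocation rule can be approximated by standard value iteration when the type space is finite. The sketch below is a generic illustration under that assumption; the function `value_iteration`, the toy valuation, and the static-type transition are all hypothetical choices made for the example, not the paper's procedure.

```python
from itertools import combinations, product

def value_iteration(agents, type_space, value, transition, delta, iters=200):
    """Iterate W(theta) <- max_a [ sum_j v_j(a, theta) + delta * E W(theta') ].

    value(j, a, theta)   -> valuation of agent j under allocation a
    transition(theta, a) -> dict {next_profile: probability}
    """
    profiles = list(product(type_space, repeat=len(agents)))
    allocations = [frozenset(c) for r in range(len(agents) + 1)
                   for c in combinations(agents, r)]
    W = {th: 0.0 for th in profiles}
    policy = {}
    for _ in range(iters):
        W_new = {}
        for th in profiles:
            best_q, best_a = None, None
            for a in allocations:
                now = sum(value(j, a, th) for j in agents)
                future = sum(p * W[nxt] for nxt, p in transition(th, a).items())
                q = now + delta * future
                if best_q is None or q > best_q:
                    best_q, best_a = q, a
            W_new[th], policy[th] = best_q, best_a
        W = W_new
    return W, policy

# Toy instance (hypothetical): two agents, static types, a selected agent
# contributes her own type to welfare and an unselected agent contributes zero.
agents = [1, 2]
value = lambda j, a, th: float(th[j - 1]) if j in a else 0.0
transition = lambda th, a: {th: 1.0}  # types never change in this toy case
W, policy = value_iteration(agents, [-1, 2], value, transition, delta=0.5)
print(W[(2, -1)], sorted(policy[(2, -1)]))
```

In this toy case the best policy selects only the agent with a positive type in every round, so the welfare at profile $(2, -1)$ approaches $2 / (1 - 0.5) = 4$.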


Challenges in mechanism design with interdependent valuations   The value
interdependency among the agents poses a challenge for designing mechanisms. Even in a
static setting, if the allocation and payment are decided simultaneously under the interdependent
valuation setting, efficiency and Bayesian incentive compatibility (and therefore ex-post
incentive compatibility) cannot be satisfied together (Jehiel and Moldovanu, 2001). In a later
paper, Jehiel et al. (2006) show that the only deterministic social choice functions that are
ex-post implementable in generic mechanism design frameworks with multi-dimensional signals,
interdependent valuations, and transferable utilities are constant functions. In view of
these impossibility results, we are compelled to split the decisions of allocation and payment
into two separate stages. We mimic the two-stage mechanism of Mezzetti (2004) for
each time instant of the dynamic setting (see Figure 1). We consider a direct revelation
mechanism. In the first stage of this two-stage mechanism, the agents observe their individual
types $\theta_{i,t} \in \Theta_i$, $i \in N$. The strategies available to the agents are to report any type
$\hat{\theta}_{i,t} \in \Theta_i$. The designer decides the allocation $a(\hat{\theta}_t)$ depending on the reported types $\hat{\theta}_t$ in
the first stage. The reported types of the agents are not revealed publicly in the first stage. This
assumption plays a crucial role in the concept of incentive compatibility we use in this paper.
We discuss this briefly after the definition of incentive compatibility and in detail in the next
section. After the allocation, the agents observe their valuations $v_i(a(\hat{\theta}_t), \theta_t)$'s, and report
$\hat{v}_{i,t}$'s to the designer. The payment decision is made after this second stage of reporting. Our
definition of incentive compatibility is accordingly modified for a two stage mechanism.

Due to SMTT and the infinite horizon of the MDP, we will focus only on stationary
mechanisms, which give a stationary allocation and payment to the agents in each round of
the dynamic game. Let us denote a typical two-stage dynamic mechanism by $M = \langle a, p \rangle$. The
function $a : \Theta \to A$ yields an allocation for a reported type profile $\hat{\theta}_t$ in round $t$. Depending
on the reported types in the first stage, the mechanism designer decides the allocation $a(\hat{\theta}_t)$,
due to which agent $i$ experiences a valuation of $v_i(a(\hat{\theta}_t), \theta_t)$ in round $t$. Let us suppose that
in the second stage, the reported value vector is given by $\hat{v}_t$. The payment function $p$ is
a vector where $p_i(\hat{\theta}_t, \hat{v}_t)$ is the payment received by agent $i$ at instant $t$. Combining the
value and payment in each round we can write the expected discounted utility of agent $i$ in
the quasi-linear setting, denoted by $u_i^M(\hat{\theta}_t, \hat{v}_t \mid \theta_t)$, when the true type vector is $\theta_t$ and the
reported type and value vectors are $\hat{\theta}_t$ and $\hat{v}_t$ respectively. This utility has two parts: (a) the
current round utility, and (b) expectation over the future round utilities. The expectation 
over the future rounds is taken on the true types. Thus the effect of manipulation is limited 
only to the current round in this utility expression. This is enough to consider due to the 
single deviation principle of Blackwell (1965). 


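\[ u_i^M(\hat{\theta}_t, \hat{v}_t \mid \theta_t) = \underbrace{v_i(a(\hat{\theta}_t), \theta_t) + p_i(\hat{\theta}_t, \hat{v}_t)}_{\text{current round utility}} + \underbrace{\mathbb{E}_{\pi,\theta_t}\left[ \sum_{s=t+1}^{\infty} \delta^{s-t} \bigl( v_i(a(\theta_s), \theta_s) + p_i(\theta_s, v_s) \bigr) \right]}_{\text{expected discounted future utility}} \tag{6} \]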


Here $\pi$ denotes the stationary policy of actions, $(a(\cdot), a(\cdot), \ldots)$. For the SMTT, the type
evolution is dependent on only the current type profile and action. To avoid confusion, we
will use $\pi$, $a(\theta_t)$, or $a(\theta_s)$, $s \ge t + 1$, according to the context.

Equipped with this notation, we can now define incentive compatibility. 


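Definition 5 (w.p. EPIC) A mechanism $M = \langle a, p \rangle$ is within period Ex-post Incentive
Compatible (w.p. EPIC) if for all agents $i \in N$, for all possible true types $\theta_t$, for all reported
types $\hat{\theta}_{i,t}$, for all reported values $\hat{v}_{i,t}$, and for all $t$,

\[ u_i^M\bigl(\theta_t, (v_i(a(\theta_t), \theta_t), v_{-i}(a(\theta_t), \theta_t)) \mid \theta_t\bigr)
 \ge u_i^M\bigl((\hat{\theta}_{i,t}, \theta_{-i,t}), (\hat{v}_{i,t}, v_{-i}(a(\hat{\theta}_{i,t}, \theta_{-i,t}), \theta_t)) \mid \theta_t\bigr). \]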


That is, reporting the types and valuations in the two stages truthfully is an ex-post Nash
equilibrium. We use 'ex-post' to denote that the agent chooses her action after observing her
own type and valuation, and not the types of others, since that is not revealed to her according
to the mechanism considered here.^1 The reported valuation $\hat{v}_{i,t}$ is therefore a function of
the types $\theta_{i,t}$ and $\hat{\theta}_{i,t}$ and not of either $\theta_{-i,t}$ or $\hat{\theta}_{-i,t}$, according to the assumption above.
An interesting question would be: what happens when the agents' type reports in the first
stage are made public? The agents' valuation reports in the second stage can then depend
on the type reports in the first stage. The appropriate equilibrium concept in that setting
is the subgame perfect equilibrium. We present a detailed discussion on the implications of
revealing the type reports in the first stage after presenting the proposed mechanism in the
next section.

In this context, individual rationality is defined as follows.


^1 Some readers may interpret the term 'ex-post' differently, since the term is conventionally used in the
context of single stage mechanisms, i.e., where the decisions of allocation and transfer are made simultaneously
(see, e.g., Jehiel et al. (2006)), and it denotes that truthful reporting is optimal for every realization of
the other agents' types even if the agent knew the other agents' types. In the context of two-stage mechanisms
that we consider here, we feel that this equilibrium of full observability is better called a 'subgame
perfect' equilibrium. This is the equilibrium concept used in the static two stage mechanism by Mezzetti
(2004), and we discuss in detail the difference between the two equilibrium concepts in the next section.









Definition 6 (w.p. EPIR) A mechanism $M = \langle a, p \rangle$ is within period Ex-post Individually
Rational (w.p. EPIR) if for all agents $i \in N$, for all possible true types $\theta_t$ and for all
$t$,

\[ u_i^M\bigl(\theta_t, (v_i(a(\theta_t), \theta_t), v_{-i}(a(\theta_t), \theta_t)) \mid \theta_t\bigr) \ge 0. \]

That is, reporting the types and valuations in the two stages truthfully yields non-negative
expected utility.

3 The MATRIX Mechanism under SA and PIV 

In the interdependent valuation setting, our goal is to design a mechanism which is efficient 
(Def. 4), w.p. EPIC (Def. 5), and w.p. EPIR (Def. 6). This is non-trivial because to achieve 
efficient allocation in a dynamic setting, one needs to consider the expected future evolution 
of the types of the agents, which would reflect in the allocation and payment decisions, 
and for this reason a fixed payment mechanism or a repeated VCG fails to satisfy efficiency 
(Def. 4). The value interdependency among the agents plays a crucial role here. Even in a 
static interdependent value setting, if the allocation and payment are decided simultaneously, 
one cannot guarantee efficiency and incentive compatibility together (Jehiel and Moldovanu, 
2001). One way out is to split the decision of allocation and payment in two stages (Mezzetti, 
2004). 

Following this observation, we propose MDP-based Allocation and TRansfer in 
Interdependent-valued eXchange economies (MATRIX), which we prove to satisfy EFF, w.p. 
EPIC and w.p. EPIR under the restricted setting of SA and PIV. 

Given the dynamics of the game, illustrated in Figure 1, the agents report their types
in the first stage, and then the allocation is decided. In the second stage, they report
their experienced values and the payment is decided. The task of the mechanism designer,
therefore, is to design the allocation and payment rules $\langle a, p \rangle$ in each time instant.
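
The order of events in one period can be sketched as a function that takes the designer's rules and the agents' reporting strategies as arguments. This is only a skeleton under assumed interfaces: `run_round` and all the callables passed to it are hypothetical, and the payment rule used in the toy call is a placeholder, not the payment of MATRIX.

```python
def run_round(theta, report_type, report_value, allocate, true_value, pay, evolve):
    """One period of the two-stage protocol; every callable is caller-supplied."""
    agents = list(theta)
    # Stage 1: each agent sees only her own type and submits a type report.
    theta_hat = {i: report_type(i, theta[i]) for i in agents}
    allocation = allocate(theta_hat)
    # Stage 2: values are realised under the TRUE types, then reported.
    realised = {i: true_value(i, allocation, theta) for i in agents}
    v_hat = {i: report_value(i, realised[i]) for i in agents}
    payments = {i: pay(i, theta_hat, v_hat, allocation) for i in agents}
    return allocation, v_hat, payments, evolve(theta, allocation)

# Toy usage with truthful agents. The payment below is a placeholder
# (sum of the other agents' reported values), NOT the MATRIX payment.
theta = {1: 2.0, 2: -1.0, 3: 0.5}
allocation, v_hat, payments, theta_next = run_round(
    theta,
    report_type=lambda i, th_i: th_i,
    report_value=lambda i, v: v,
    allocate=lambda th_hat: frozenset(i for i, th in th_hat.items() if th > 0),
    true_value=lambda i, a, th: th[i] * len(a) if i in a else 0.0,
    pay=lambda i, th_hat, v_hat, a: sum(v for j, v in v_hat.items() if j != i),
    evolve=lambda th, a: dict(th),
)
print(sorted(allocation), payments)
```

Separating the allocation call from the payment call mirrors the split of the two decisions into two stages: the payment rule sees the value reports, the allocation rule does not.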

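In the context of SA and PIV, the social welfare given by Eq. (4) is modified as follows.

\[ W(\theta_t) = \max_{\pi} \mathbb{E}_{\pi,\theta_t}\left[ \sum_{s=t}^{\infty} \delta^{s-t} \sum_{j \in N} v_j(a(\theta_s), \theta_{a(\theta_s)}) \right]
 = \max_{a \in A} \mathbb{E}_{a,\theta_t}\left[ \sum_{j \in N} v_j(a, \theta_a) + \delta\, \mathbb{E}_{\theta_{t+1} \mid a, \theta_t} W(\theta_{t+1}) \right] \tag{7} \]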


We also define the maximum social welfare excluding agent $i$ to be $W_{-i}(\theta_{-i,t})$, which is
the same as Eq. (4) except now the sum of the valuations and the allocations are over all
agents $j \ne i$. We also use the set of allocations excluding $i$ to be $A_{-i}$ as defined by SA,

\[ W_{-i}(\theta_{-i,t}) = \max_{a_{-i} \in A_{-i}} \mathbb{E}_{a_{-i},\theta_t}\left[ \sum_{j \in N \setminus \{i\}} v_j(a_{-i}, \theta_{a_{-i}}) + \delta\, \mathbb{E}_{\theta_{t+1} \mid a_{-i}, \theta_t} W_{-i}(\theta_{-i,t+1}) \right]. \tag{8} \]

Note that, SA and PIV are crucial for defining this quantity. Also, when $i$ is absent, the
following two notations are equivalent: since the type of i 

will be unchanged when she is not in the game. However, we adopt the former for consistency
in notation. Using the definitions above and in the previous section, we now formally present
MATRIX.

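Mechanism 1 (MATRIX) Given the reported type profile $\hat{\theta}_t$ in stage 1, choose the agents
$a^*(\hat{\theta}_t)$ as follows.

\[ a^*(\hat{\theta}_t) \in \arg\max_{a \in A} \mathbb{E}_{a,\hat{\theta}_t}\left[ \sum_{j \in N} v_j(a, \hat{\theta}_a) + \delta\, \mathbb{E}_{\theta_{t+1} \mid a, \hat{\theta}_t} W(\theta_{t+1}) \right], \tag{9} \]

and transfer to agent $i$ after agents report $\hat{v}_t$ in stage 2, a payment of,

\[ p_i^*(\hat{\theta}_t, \hat{v}_t) = \Bigl( \sum_{j \ne i} \hat{v}_{j,t} \Bigr) + \delta\, \mathbb{E}_{\theta_{t+1} \mid a^*(\hat{\theta}_t), \hat{\theta}_t} W_{-i}(\theta_{-i,t+1}) - W_{-i}(\hat{\theta}_{-i,t}) - \bigl( \hat{v}_{i,t} - v_i(a^*(\hat{\theta}_t), \hat{\theta}_{a^*(\hat{\theta}_t)}) \bigr)^2. \tag{10} \]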


The last quadratic term in the above equation is agent $i$'s penalty for not being consistent
with the first stage report. The intuition of charging a penalty is to make sure
that agent $i$ is consistent with her reported type $\hat{\theta}_{i,t}$ in the first stage and her value
report $\hat{v}_{i,t}$ in the second stage, given that others are reporting their types and values
truthfully. We will argue that when all agents other than agent $i$ report their types and
values truthfully in the two stages of the mechanism, it is the best response for agent $i$ to do
so as well. This term distinguishes our mechanism from that given by Mezzetti (2004), where
the agents are weakly indifferent between reporting true and false values in the second stage.
We summarize the dynamics of MATRIX using an algorithmic flowchart in Algorithm 1.

Algorithm 1 MATRIX

for all time instants t do
    Stage 1:
    for agents i = 0, 1, ..., n do
        agent i observes $\theta_{i,t}$;
        agent i reports $\hat{\theta}_{i,t}$;
    end for
    compute allocation $a^*(\hat{\theta}_t)$ according to Eq. 9;
    Stage 2:
    for agents i = 0, 1, ..., n do
        agent i observes $v_i(a^*(\hat{\theta}_t), \theta_{a^*(\hat{\theta}_t)})$;
        agent i reports $\hat{v}_{i,t}$;
    end for
    compute payment to agent i, $p_i^*(\hat{\theta}_t, \hat{v}_t)$, Eq. 10;
    types evolve $\theta_t \to \theta_{t+1}$ according to SMTT;
end for

We have used this quadratic term for the ease of exposition. However, it is easy to
show that any non-negative function $g(x, \ell)$ having the property that $g(x, \ell) = 0 \Leftrightarrow x = \ell$
would still satisfy the claims made in this paper. Nath and Zoeter (2013) use a similar
term to ensure strict truthfulness in the second stage of a two stage static mechanism with
interdependent valuations.
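
The role of the penalty can be checked numerically: holding every other term of the payoff fixed, a second-stage report that deviates from the value implied by the first-stage report strictly lowers the agent's payoff. The sketch below assumes the quadratic form of the penalty; `second_stage_utility` and the constant `rest` (standing in for all terms that do not depend on the value report) are illustrative devices, not objects defined in the paper.

```python
def quadratic_penalty(reported_value, consistent_value):
    """Penalty that is zero exactly when the two arguments coincide."""
    return (reported_value - consistent_value) ** 2

def second_stage_utility(reported_value, consistent_value, rest):
    """Payoff as a function of the second-stage report only.

    `rest` collects every term that does not depend on that report.
    """
    return rest - quadratic_penalty(reported_value, consistent_value)

consistent = 3.0  # value implied by the agent's own first-stage report
candidates = [2.0, 2.5, 3.0, 3.5, 4.0]
best = max(candidates, key=lambda x: second_stage_utility(x, consistent, rest=10.0))
print(best)
```

Any other non-negative function that vanishes only at the consistent value could replace `quadratic_penalty` here without changing which report is the unique maximizer.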

MATRIX and Subgame Perfection   Since this paper is an extension of the results of
Nath and Zoeter (2013) to a dynamic type setting, we can make similar comparisons of
properties with the mechanism of Mezzetti (2004) (let us call this the classic mechanism). If we
consider the case where the first stage type reports are made public by the mechanism, i.e.,
observable by all agents, then the agents have a chance of modifying their next stage report
depending on that information. The concept of truthfulness should be modified to subgame
perfect equilibrium in this context, which ensures that truth-telling is an equilibrium in every
subgame of the two stage game. It can be shown that an agent $i$ can misreport her type
in the first stage from $\theta_{i,t}$ to $\hat{\theta}_{i,t}$ when other agents are reporting their types truthfully, and in
this subgame, since the reported types are public, each agent's best response would be to
report valuations consistent with the first stage's reported types, $v_i(a^*(\hat{\theta}_t), \hat{\theta}_{a^*(\hat{\theta}_t)})$ (and not
the true valuations $v_i(a^*(\hat{\theta}_t), \theta_{a^*(\hat{\theta}_t)})$), which results in more utility to agent $i$ than reporting
types truthfully in the first stage (see Nath and Zoeter (2013), where Example 1 illustrates
this and can be modified to the dynamic setting for a similar conclusion). Hence, if the first
stage reports are made public, MATRIX does not ensure truthfulness in a subgame perfect
equilibrium. The classic mechanism, on the other hand, continues to satisfy truth-telling
in a subgame perfect equilibrium even in this complete information scenario, and this is
because the utility of the agent is unaffected by her second stage valuation reports.

To summarize: in the incomplete information setting, MATRIX provides a strict truthfulness
guarantee in the second stage and the truthfulness is in an ex-post Nash equilibrium, but in
a complete information setting, it does not ensure truthfulness in a subgame perfect
equilibrium. The classic mechanism, in contrast, is not strictly truthful in an ex-post Nash
equilibrium for an incomplete information setting, but is weakly truthful in a subgame perfect
equilibrium in the complete information setting. It is important to note that even though the classic
mechanism is weakly truthful in the second stage, and every agent's utility is unaffected by
their valuation report, the truthfulness in the type reports in the first stage requires that the
agents be truthful in the second stage. Hence, one needs to assume in the mechanism by
Mezzetti (2004) that the agents report their valuations truthfully even when their utilities
are unaffected by their reports.

3.1 Efficiency and incentive compatibility 

The following theorem shows that MATRIX satisfies two desirable properties in the restricted
setting of SA and PIV.

Theorem 1 Under SMTT, with SA and PIV, MATRIX is EFF and w.p. EPIC. In addition,
the second stage of MATRIX is strictly EPIC.

MATRIX is a two stage mechanism, and we need to ensure that truth-telling is a best
response in both these stages.

Proof: Clearly, given true reported types, the allocation of MATRIX is efficient by
Definition 4. Hence, we need to show only that MATRIX is w.p. EPIC.

To show that MATRIX is w.p. EPIC, let us assume that the true type profile at time $t$ is
$\theta_t$, and all agents $j \ne i$ report their true types and values in each round $s = t, t+1, \ldots$.
Only agent $i$ reports $\hat{\theta}_{i,t}$ and $\hat{v}_{i,t}$ in the two stages. Therefore, $\hat{\theta}_t = (\hat{\theta}_{i,t}, \theta_{-i,t})$ and
$\hat{v}_{j,t} = v_j(a^*(\hat{\theta}_t), \theta_{a^*(\hat{\theta}_t)})$ for all $j \ne i$. Using the single deviation principle (Blackwell, 1965),
we conclude that it is enough to consider only a single shot deviation from the true report
of the type. Hence, without loss of generality, let us assume that agent $i$ deviates only in
round $t$ of this game.

Let us write down the discounted utility to agent i at time t. 

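\[
\begin{aligned}
u_i^{\text{MATRIX}}(\hat\theta_t, \hat v_t \mid \theta_t)
&= \underbrace{v_i(a^*(\hat\theta_t), \theta_{a^*(\hat\theta_t),t}) + p_i^*(\hat\theta_t, \hat v_t)}_{\text{current round utility}} \\
&\quad + \underbrace{\mathbb{E}_{\pi^*,\theta_t}\Big[\sum_{s=t+1}^{\infty} \delta^{s-t}\big(v_i(a^*(\theta_s), \theta_{a^*(\theta_s),s}) + p_i^*(\theta_s, v_s)\big)\Big]}_{\text{expected discounted future utility}} \\
&= v_i(a^*(\hat\theta_t), \theta_{a^*(\hat\theta_t),t}) + \sum_{j \neq i} \hat v_{j,t}
 + \delta\,\mathbb{E}_{\theta_{t+1} \mid a^*(\hat\theta_t), \hat\theta_t} W_{-i}(\theta_{-i,t+1}) - W_{-i}(\hat\theta_{-i,t}) \\
&\quad - \big(\hat v_{i,t} - v_i(a^*(\hat\theta_t), \hat\theta_{a^*(\hat\theta_t),t})\big)^2 \\
&\quad + \mathbb{E}_{\pi^*,\theta_t}\Big[\sum_{s=t+1}^{\infty} \delta^{s-t}\big(v_i(a^*(\theta_s), \theta_{a^*(\theta_s),s}) + p_i^*(\theta_s, v_s)\big)\Big]
\end{aligned}
\]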


We use the shorthand $\pi^*$ to denote the allocation policy under MATRIX. This gives rise
to the allocations $a^*(\cdot)$ in each round given the type profiles (either reported or true). The
first equality is from Eq. (6). The second equality comes by substituting the expression of
payment from Eq. (10).

Now, from the previous discussion on the $\hat v_{j,t}$ and $\hat\theta_{j,t}$, $j \neq i$, we get,
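\[
\begin{aligned}
u_i^{\text{MATRIX}}&\big((\hat\theta_{i,t}, \theta_{-i,t}), (\hat v_{i,t}, v_{-i,t}) \mid \theta_t\big) \\
&= v_i(a^*(\hat\theta_t), \theta_{a^*(\hat\theta_t),t}) + \sum_{j \neq i} v_j(a^*(\hat\theta_t), \theta_{a^*(\hat\theta_t),t})
 + \delta\,\mathbb{E}_{\theta_{t+1} \mid a^*(\hat\theta_t), \hat\theta_t} W_{-i}(\theta_{-i,t+1}) - W_{-i}(\theta_{-i,t}) \\
&\quad - \big(\hat v_{i,t} - v_i(a^*(\hat\theta_t), \hat\theta_{a^*(\hat\theta_t),t})\big)^2
 + \mathbb{E}_{\pi^*,\theta_t}\Big[\sum_{s=t+1}^{\infty} \delta^{s-t}\big(v_i(a^*(\theta_s), \theta_{a^*(\theta_s),s}) + p_i^*(\theta_s, v_s)\big)\Big] \\
&\le v_i(a^*(\hat\theta_t), \theta_{a^*(\hat\theta_t),t}) + \sum_{j \neq i} v_j(a^*(\hat\theta_t), \theta_{a^*(\hat\theta_t),t})
 + \delta\,\mathbb{E}_{\theta_{t+1} \mid a^*(\hat\theta_t), \hat\theta_t} W_{-i}(\theta_{-i,t+1}) \\
&\quad - W_{-i}(\theta_{-i,t})
 + \mathbb{E}_{\pi^*,\theta_t}\Big[\sum_{s=t+1}^{\infty} \delta^{s-t}\big(v_i(a^*(\theta_s), \theta_{a^*(\theta_s),s}) + p_i^*(\theta_s, v_s)\big)\Big]
\end{aligned}
\tag{11}
\]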


The equality comes because of the assumption that all agents $j \neq i$ report their types
and values truthfully. The inequality is because we are ignoring a non-positive term. Now,
let us consider the last term of the above equation.


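\[
\begin{aligned}
\mathbb{E}_{\pi^*,\theta_t}&\Big[\sum_{s=t+1}^{\infty} \delta^{s-t}\big(v_i(a^*(\theta_s), \theta_{a^*(\theta_s),s}) + p_i^*(\theta_s, v_s)\big)\Big] \\
&= \mathbb{E}_{\pi^*,\theta_t}\Big[\sum_{s=t+1}^{\infty} \delta^{s-t}\Big(v_i(a^*(\theta_s), \theta_{a^*(\theta_s),s}) + \sum_{j \neq i} v_j(a^*(\theta_s), \theta_{a^*(\theta_s),s}) \\
&\qquad\qquad + \delta\,\mathbb{E}_{\theta_{s+1} \mid a^*(\theta_s), \theta_s} W_{-i}(\theta_{-i,s+1}) - W_{-i}(\theta_{-i,s})\Big)\Big] \\
&= \mathbb{E}_{\pi^*,\theta_t}\Big[\sum_{s=t+1}^{\infty} \delta^{s-t} \sum_{j \in N} v_j(a^*(\theta_s), \theta_{a^*(\theta_s),s})\Big] \\
&\quad + \mathbb{E}_{\pi^*,\theta_t}\Big[\sum_{s=t+1}^{\infty} \delta^{s-t}\big(\delta\,\mathbb{E}_{\theta_{s+1} \mid a^*(\theta_s), \theta_s} W_{-i}(\theta_{-i,s+1}) - W_{-i}(\theta_{-i,s})\big)\Big]
\end{aligned}
\]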

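The first equality comes from Eq. (10). We can now rearrange the expectation for the first
term above using the Markov property of $\theta_t$, which gives $\mathbb{E}_{\pi^*,\theta_t}[\cdot] = \mathbb{E}_{\theta_{t+1} \mid a^*(\hat\theta_t), \theta_t}[\mathbb{E}_{\pi^*,\theta_{t+1}}[\cdot]]$.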

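Therefore,

\[
\begin{aligned}
\mathbb{E}_{\pi^*,\theta_t}&\Big[\sum_{s=t+1}^{\infty} \delta^{s-t}\big(v_i(a^*(\theta_s), \theta_{a^*(\theta_s),s}) + p_i^*(\theta_s, v_s)\big)\Big] \\
&= \mathbb{E}_{\theta_{t+1} \mid a^*(\hat\theta_t), \theta_t}\Big[\mathbb{E}_{\pi^*,\theta_{t+1}}\Big[\sum_{s=t+1}^{\infty} \delta^{s-t} \sum_{j \in N} v_j(a^*(\theta_s), \theta_{a^*(\theta_s),s})\Big]\Big] \\
&\quad + \mathbb{E}_{\pi^*,\theta_t}\Big[\sum_{s=t+1}^{\infty} \delta^{s-t}\big(\delta\,\mathbb{E}_{\theta_{s+1} \mid a^*(\theta_s), \theta_s} W_{-i}(\theta_{-i,s+1}) - W_{-i}(\theta_{-i,s})\big)\Big] \\
&= \delta\,\mathbb{E}_{\theta_{t+1} \mid a^*(\hat\theta_t), \theta_t}\big(W(\theta_{t+1})\big) \\
&\quad + \mathbb{E}_{\pi^*,\theta_t}\Big[\sum_{s=t+1}^{\infty} \delta^{s-t}\big(\delta\,\mathbb{E}_{\theta_{s+1} \mid a^*(\theta_s), \theta_s} W_{-i}(\theta_{-i,s+1}) - W_{-i}(\theta_{-i,s})\big)\Big]
\end{aligned}
\tag{12}
\]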


The last equality comes from the definition of $W(\theta_{t+1})$. Let us now focus on the last
term of the above equation.


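\[
\mathbb{E}_{\pi^*,\theta_t}\Big[\sum_{s=t+1}^{\infty} \delta^{s-t}\big(\delta\,\mathbb{E}_{\theta_{s+1} \mid a^*(\theta_s), \theta_s} W_{-i}(\theta_{-i,s+1}) - W_{-i}(\theta_{-i,s})\big)\Big]
= -\delta\,\mathbb{E}_{\theta_{t+1} \mid a^*(\hat\theta_t), \theta_t} W_{-i}(\theta_{-i,t+1}),
\tag{13}
\]

since consecutive terms of the sum cancel in expectation and only the first negative term survives.

Combining Equations 11, 12, and 13, we get,

\[
\begin{aligned}
u_i^{\text{MATRIX}}&\big((\hat\theta_{i,t}, \theta_{-i,t}), (\hat v_{i,t}, v_{-i,t}) \mid \theta_t\big) \\
&\le v_i(a^*(\hat\theta_t), \theta_{a^*(\hat\theta_t),t}) + \sum_{j \neq i} v_j(a^*(\hat\theta_t), \theta_{a^*(\hat\theta_t),t})
 + \delta\,\mathbb{E}_{\theta_{t+1} \mid a^*(\hat\theta_t), \hat\theta_t} W_{-i}(\theta_{-i,t+1}) - W_{-i}(\theta_{-i,t}) \\
&\quad + \delta\,\mathbb{E}_{\theta_{t+1} \mid a^*(\hat\theta_t), \theta_t} W(\theta_{t+1})
 - \delta\,\mathbb{E}_{\theta_{t+1} \mid a^*(\hat\theta_t), \theta_t} W_{-i}(\theta_{-i,t+1})
\end{aligned}
\tag{14}
\]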


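We also note that,

\[
\mathbb{E}_{\theta_{t+1} \mid a^*(\hat\theta_t), \hat\theta_t} W_{-i}(\theta_{-i,t+1})
= \mathbb{E}_{\theta_{t+1} \mid a^*(\hat\theta_t), \theta_t} W_{-i}(\theta_{-i,t+1}).
\tag{15}
\]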


This is because when $i$ is removed from the system in the SA domain (while computing
$W_{-i}(\theta_{-i,t+1})$), the values of none of the other agents will depend on the type $\theta_{i,t+1}$, due to
PIV. And due to the independence of type transitions, $i$'s reported type $\hat\theta_{i,t}$ can only influence
$\theta_{i,t+1}$. Hence, the reported type of agent $i$ at $t$, i.e., $\hat\theta_{i,t}$, cannot affect $W_{-i}(\theta_{-i,t+1})$.

Hence, Equation 14 can be rewritten to show the following inequality.

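\[
\begin{aligned}
u_i^{\text{MATRIX}}&\big((\hat\theta_{i,t}, \theta_{-i,t}), (\hat v_{i,t}, v_{-i,t}) \mid \theta_t\big) \\
&\le v_i(a^*(\hat\theta_t), \theta_{a^*(\hat\theta_t),t}) + \sum_{j \neq i} v_j(a^*(\hat\theta_t), \theta_{a^*(\hat\theta_t),t})
 + \delta\,\mathbb{E}_{\theta_{t+1} \mid a^*(\hat\theta_t), \theta_t} W(\theta_{t+1}) - W_{-i}(\theta_{-i,t})
 && \text{(from Eq. 15)} \\
&= \sum_{j \in N} v_j(a^*(\hat\theta_t), \theta_{a^*(\hat\theta_t),t})
 + \delta\,\mathbb{E}_{\theta_{t+1} \mid a^*(\hat\theta_t), \theta_t} W(\theta_{t+1}) - W_{-i}(\theta_{-i,t}) \\
&\le \sum_{j \in N} v_j(a^*(\theta_t), \theta_{a^*(\theta_t),t})
 + \delta\,\mathbb{E}_{\theta_{t+1} \mid a^*(\theta_t), \theta_t} W(\theta_{t+1}) - W_{-i}(\theta_{-i,t})
 && \text{(by definition of } a^*(\theta_t) \text{, Eq. 9)} \\
&= u_i^{\text{MATRIX}}\big(\theta_t, (v_i(a^*(\theta_t), \theta_{a^*(\theta_t),t}), v_{-i}(a^*(\theta_t), \theta_{a^*(\theta_t),t})) \mid \theta_t\big)
\end{aligned}
\tag{16}
\]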


This shows that the utility of agent $i$ is maximized when $\hat\theta_{i,t} = \theta_{i,t}$ and $\hat v_{i,t} = v_i(a^*(\theta_t), \theta_{a^*(\theta_t),t})$.
This proves that MATRIX is within period ex-post incentive compatible.

We now argue that the second stage is strictly EPIC for an agent $i$. This happens because
of the quadratic penalty term $\big(\hat v_{i,t} - v_i(a^*(\hat\theta_t), \hat\theta_{a^*(\hat\theta_t),t})\big)^2$ in the payment $p_i^*$ (Eq. (10)). Notice
that if all the agents except $i$ report their types and values truthfully, and agent $i$ also reports
her type truthfully in the first stage, then the penalty term will always penalize her if $\hat v_{i,t}$
is different from $v_i(a^*(\theta_t), \theta_{a^*(\theta_t),t})$, which is her true valuation. Hence, the best response of
agent $i$ would be to report the true values in the second stage, which makes MATRIX strictly
EPIC in this stage. ■
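
The strictness argument can be illustrated numerically. The following Python sketch is not part of the proof; it assumes that agent i has reported her type truthfully in the first stage, so that the reference value inside the penalty coincides with her true valuation, and it lumps every utility term that does not depend on her value report into a hypothetical constant `other_terms`. The function name and all numbers are illustrative assumptions.

```python
def second_stage_utility(reported_value, true_value, other_terms=0.0):
    """Utility of agent i as a function of her second-stage value report only.

    other_terms is a hypothetical placeholder for all terms of the utility
    that do not depend on the agent's own value report.
    """
    penalty = (reported_value - true_value) ** 2  # quadratic penalty term
    return other_terms - penalty


true_value = 3.0
candidate_reports = [0.0, 1.5, 3.0, 4.5, 6.0]

# Evaluate the utility for each candidate report and pick the best one.
utilities = {
    report: second_stage_utility(report, true_value, other_terms=10.0)
    for report in candidate_reports
}
best_report = max(utilities, key=utilities.get)
print(best_report, utilities)
```

Every report other than the true value yields strictly lower utility in this toy instance, which is the sense in which the second stage is strict rather than weak.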


3.2 Why a dynamic pivot mechanism would not work in this setting

It is interesting to note that, if we tried to use the dynamic pivot mechanism (DPM)
(Bergemann and Valimaki, 2010) unmodified in this setting, the true type profile $\theta_t$ in the
first summation of Eq. (14) would have been replaced by $\hat\theta_t$, since this comes from the
payment term (Eq. (10)). The proof for the DPM relies on the private value assumption (see
the beginning of Section 2 for a definition) such that, when reasoning about the valuations
for the other agents $j \neq i$, we have $v_j(a^*((\hat\theta_{i,t}, \theta_{-i,t})), (\hat\theta_{i,t}, \theta_{-i,t})) = v_j(a^*(\hat\theta_t), \theta_{j,t})$, with which
the EPIC claim of DPM can be shown. But in the interdependent value setting, we cannot
do such a substitution, and hence the proof of EPIC in DPM does not work. We have to
invoke the second stage of value reporting in order to satisfy EPIC.

3.3 Ex-post individual rationality 

With SA and PIV, we now show that MATRIX is individually rational. 

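Theorem 2 (Individual Rationality) Under SMTT, with SA and PIV, MATRIX is w.p.
EPIR.

Proof: Due to SA, the set of allocations excluding agent $i$, denoted by $A_{-i} = 2^{N \setminus \{i\}}$, is
already contained in the set of allocations including $i$, denoted by $A = 2^N$. Formally, this
means $a_{-i} \in A_{-i} \subseteq A \ni a$. Therefore, the policies $\pi_{-i} \in A_{-i}^{\infty} \subseteq A^{\infty} \ni \pi$. Hence in the
ex-post Nash equilibrium, the utility of agent $i$ is given by,

\[
\begin{aligned}
u_i^{\text{MATRIX}}&\big(\theta_t, (v_i(a^*(\theta_t), \theta_{a^*(\theta_t),t}), v_{-i}(a^*(\theta_t), \theta_{a^*(\theta_t),t})) \mid \theta_t\big) \\
&= \sum_{j \in N} v_j(a^*(\theta_t), \theta_{a^*(\theta_t),t}) + \delta\,\mathbb{E}_{\theta_{t+1} \mid a^*(\theta_t), \theta_t} W(\theta_{t+1}) - W_{-i}(\theta_{-i,t}) \\
&= W(\theta_t) - W_{-i}(\theta_{-i,t}) \\
&\ge 0.
\end{aligned}
\]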

The first equality comes from the last equality in Equation 16 and the second equality is by
definition of $W(\theta_t)$ and $a^*(\theta_t)$. The last inequality is an immediate consequence of SA and
PIV, as the allocation that maximizes the social welfare excluding agent i is already in the 
potential allocations when i is present. This proves that MATRIX is within period ex-post 
individually rational. 
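
The inclusion argument behind the last inequality can be checked by brute force in a static, single-round analogue (that is, with the discounted future ignored). The Python sketch below is only an illustration under stated assumptions: allocations are subsets of agents (SA), an unallocated agent has value zero, and the value of an allocated agent depends only on the types of the allocated agents (PIV). The specific valuation, `theta[j] * min(theta[k] for k in allocation) - cost[j]`, and all numbers are hypothetical and do not come from the paper.

```python
from itertools import chain, combinations


def subsets(agents):
    """All subsets of the agent set, as tuples in a fixed sorted order."""
    agents = sorted(agents)
    return chain.from_iterable(
        combinations(agents, k) for k in range(len(agents) + 1)
    )


def value(j, allocation, theta, cost):
    """Hypothetical PIV-style value: depends only on types of allocated agents."""
    if j not in allocation:
        return 0.0  # an agent outside the allocation gets zero value
    return theta[j] * min(theta[k] for k in allocation) - cost[j]


def welfare(allocation, theta, cost):
    return sum(value(j, allocation, theta, cost) for j in allocation)


def max_welfare(agents, theta, cost):
    """Static analogue of W: best social welfare over all subsets of agents."""
    return max(welfare(a, theta, cost) for a in subsets(agents))


theta = {0: 0.9, 1: 0.4, 2: 0.7}
cost = {0: 0.3, 1: 0.3, 2: 0.2}
agents = set(theta)

w_all = max_welfare(agents, theta, cost)
# Static analogue of W(theta) - W_{-i}(theta_{-i}) for each agent i.
marginal = {i: w_all - max_welfare(agents - {i}, theta, cost) for i in agents}
print(marginal)
```

Because every subset of the agents other than i is also a subset of all agents, and its welfare does not involve the type of i, the maximum over the larger set cannot be smaller, so each entry of `marginal` is non-negative.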


3.4 Complexity of computing the allocation and payment 

The non-strategic version of the resource to task assignment problem was that of solving an 
MDP, whose complexity was polynomial in the size of state-space (Ye, 2005). Interestingly, 
for the proposed mechanism, the allocation and payment decisions are also solutions of MDPs 
(Equations 9, 10). Hence the proposed mechanism MATRIX has polynomial time complexity 
in the number of agents and size of the state-space, which is the same as that of the dynamic 
pivot mechanism (Bergemann and Valimaki, 2010). 
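
To make the MDP view concrete, the following Python sketch computes an optimal value function and policy for a tiny discounted MDP. It is an illustration only: it uses plain value iteration, which is chosen for brevity and is not the method behind the complexity bound cited above, and the two-state model, the rewards (standing in for social welfare) and the transition probabilities are all hypothetical.

```python
def value_iteration(states, actions, reward, transition, delta, tol=1e-9):
    """Return (W, policy) for a finite MDP with discount factor delta in [0, 1).

    reward(s, a) gives the one-round reward; transition(s, a) gives a dict
    mapping next states to probabilities.
    """
    W = {s: 0.0 for s in states}
    while True:
        new_W, policy = {}, {}
        for s in states:
            best_action, best_q = None, float("-inf")
            for a in actions:
                q = reward(s, a) + delta * sum(
                    p * W[s2] for s2, p in transition(s, a).items()
                )
                if q > best_q:
                    best_action, best_q = a, q
            new_W[s], policy[s] = best_q, best_action
        # Stop once successive value functions are within the tolerance.
        if max(abs(new_W[s] - W[s]) for s in states) < tol:
            return new_W, policy
        W = new_W


# Hypothetical two-state example: the state plays the role of a type profile.
states = ["low", "high"]
actions = ["idle", "assign"]


def reward(state, action):
    if action == "idle":
        return 0.0
    return 1.0 if state == "high" else -0.5


def transition(state, action):
    other = "high" if state == "low" else "low"
    return {state: 0.8, other: 0.2}  # sticky types, independent of the action


W, policy = value_iteration(states, actions, reward, transition, delta=0.9)
print(policy, W)
```

In the mechanism, one such problem would be solved for the full set of agents and one for each set with a single agent removed, which is where the dependence on the number of agents enters.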

4 Discussions on a General Result 

We can generalize the assumptions of SA and PIV in the following way that would result 
in the same conclusions as in this paper. These definitions also serve to show the minimal 
requirements of the proofs. 

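Consider a set of all possible allocations denoted by $\mathcal{A}$. The valuations are called independent
of irrelevant agents (IIA) with respect to a set of allocations $A \subseteq \mathcal{A}$ if for all $i \in N$,
$\exists\, A_{-i} \subseteq \mathcal{A}$ s.t. $\forall\, a_{-i} \in A_{-i}$,

\[
\begin{aligned}
v_j(a_{-i}, \theta) &= v_j(a_{-i}, \theta_{-i}) \\
v_i(a_{-i}, \theta) &= 0
\end{aligned}
\]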

SA and PIV together constitute a special case of IIA valuations. However, there exist
less restrictive examples as well. Consider a set of agents $N = \{1, 2, \ldots, n\}$ having types
$\theta_1, \theta_2, \ldots, \theta_n$ and a dummy agent $D$ who does not have any type. Let $A = 2^N$ and
$\mathcal{A} = 2^{N \cup \{D\}}$. We define $A_{-i} = 2^{(N \setminus \{i\}) \cup \{D\}}$, the power set of agents where the dummy replaces
agent $i$. Since the dummy does not have any type, the valuations of other agents after
replacing agent $i$ with $D$ depend only on $\theta_{-i}$. Note, in particular, that $A_{-i} \not\subseteq A$.

If now, in addition, $A_{-i} \subseteq A$, then the allocations are called monotone. SA is a monotone
set of allocations and PIV is IIA over that.

We can show that Theorem 1 extends with IIA valuations and Theorem 2 extends with 
IIA valuations with respect to monotone allocations. We omit the proofs since they follow 
identical arguments. 

5 Conclusions and Future Work 

This paper provides a first attempt at designing a dynamic mechanism that is strict ex-post
incentive compatible and efficient in an interdependent value setting with Markovian type
evolution. In a restricted domain, which appears often in real-world scenarios, we show that 
our mechanism is ex-post individually rational as well. This mechanism, MATRIX, extends 
the mechanism proposed by Mezzetti (2004) to a dynamic setting and connects it to the 
mechanism proposed by Bergemann and Valimaki (2010). 

We have discussed the interesting and challenging domain of mechanism design with 
dynamically varying types and interdependent valuations. There has been very little work 
where dynamic types and interdependent values have been addressed together. Hence, there 
is very little known on the limits of achievable properties in this domain. We have provided 
one mechanism, namely MATRIX, that is w.p. EPIC, strict in the second stage, and under 
a restricted domain, even w.p. EPIR. However, we do not know what mechanism characterizes
those properties in this domain. For example, a question that may arise is “Is this
the only efficient dynamic mechanism that satisfies strict w.p. EPIC in an interdependent
value setting?”. For the static setting with independent values we have the Green-Laffont
characterization result that answers this question for efficiency and DSIC. However, such a
characterization result is absent for interdependent valuations for both static and dynamic
mechanisms. Developing such a full characterization would be worthwhile.